Computational models: Bottom-up and top-down aspects
Authors
Abstract
Computational models of visual attention have become popular over the past decade, we believe primarily for two reasons: First, models make testable predictions that can be explored by experimentalists as well as theoreticians; second, models have practical and technological applications of interest to the applied science and engineering communities. In this chapter, we take a critical look at recent attention modeling efforts. We focus on computational models of attention as defined by Tsotsos & Rothenstein (2011): Models which can process any visual stimulus (typically, an image or video clip), which can possibly also be given some task definition, and which make predictions that can be compared to human or animal behavioral or physiological responses elicited by the same stimulus and task. Thus, we here place less emphasis on abstract models, phenomenological models, purely data-driven fitting or extrapolation models, or models specifically designed for a single task or for a restricted class of stimuli. For theoretical models, we refer the reader to a number of previous reviews that address attention theories and models more generally (Itti & Koch, 2001a; Paletta et al., 2005; Frintrop et al., 2010; Rothenstein & Tsotsos, 2008; Gottlieb & Balan, 2010; Toet, 2011; Borji & Itti, 2012b). To frame our narrative, we embrace a number of notions that have been popularized in the field, even though many of them are known to only represent coarse approximations to biophysical or psychological phenomena. These include the attention spotlight metaphor (Crick, 1984), the role of focal attention in binding features into coherent representations (Treisman & Gelade, 1980a), and the notions of an attention bottleneck, a nexus, and an attention hand as embodiments of the attentional selection process (Rensink, 2000; Navalpakkam & Itti, 2005). 
Further, we cast the problem of modeling attention computationally as comprising at least three facets: guidance (that is, which computations are involved in deciding where or what to attend to next?), selection (how is attended information segregated out of other incoming sensory information?), and enhancement (how is the information selected by attention processed differently than non-selected information?). While different theories and models have addressed all three aspects, most computational models as defined above have focused on the initial and primordial problem of guidance. Thus, guidance is our primary focus, and we refer the reader to previous reviews on selection and enhancement (Allport et al., 1993; Desimone & Duncan, 1995; Reynolds & Desimone, 1999; Driver & Frith, 2000; Robertson et al., 2003; Carrasco, 2011). Note that guidance of attention is often thought of as involving pre-attentive computations to attract the focus of attention to the next most behaviorally relevant location (hence, attention guidance models might, strictly speaking, be considered pre-attention rather than attention models). We explore models for exogenous (or bottom-up, stimulus-driven) attention guidance as well as for endogenous (or top-down, context-driven, or goal-driven) attention guidance. Bottom-up models process sensory information primarily in a feed-forward manner, typically applying successive transformations to visual features received over the entire visual field, so as to highlight those locations which contain the most interesting, important, conspicuous, or so-called salient information (Koch & Ullman, 1985; Itti & Koch, 2001a). Many, but not all, of these bottom-up models embrace the concept of a topographic saliency map, which is a spatial map where the map value at every location directly represents visual salience, abstracted from the details of why a location is salient or not (Koch & Ullman, 1985).
Under the saliency map hypothesis, the task of a computational model is then to transform an image into its spatially corresponding saliency map, possibly also taking into account temporal relations between successive video frames of a movie (Itti et al., 1998). Many models thus attempt to provide an operational definition of salience in terms of some image transform or some importance operator that can be applied to an image and that directly returns salience at every location, as we further examine below.
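To make the saliency map hypothesis concrete, here is a minimal sketch of one possible "importance operator". It is not the method of any model cited above. It assumes a single feature (grayscale intensity) and a crude center-surround contrast computed with box filters. The function names (box_mean, saliency_map, most_salient_location) and the default radii are hypothetical choices made only for illustration.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image as rows of intensities


def box_mean(img: Image, radius: int) -> Image:
    """Mean over each pixel's (2*radius+1)^2 window, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            total = sum(img[j][i] for j in ys for i in xs)
            out[y][x] = total / (len(ys) * len(xs))
    return out


def saliency_map(img: Image, center_radius: int = 1,
                 surround_radius: int = 4) -> Image:
    """Toy operator (an assumption): |center mean - surround mean|, rescaled to [0, 1]."""
    center = box_mean(img, center_radius)
    surround = box_mean(img, surround_radius)
    contrast = [[abs(c - s) for c, s in zip(c_row, s_row)]
                for c_row, s_row in zip(center, surround)]
    peak = max(max(row) for row in contrast)
    if peak == 0.0:
        return contrast  # uniform image: no location stands out
    return [[v / peak for v in row] for row in contrast]


def most_salient_location(sal: Image) -> Tuple[int, int]:
    """(row, col) of the map maximum, a candidate next focus of attention."""
    coords = ((y, x) for y in range(len(sal)) for x in range(len(sal[0])))
    return max(coords, key=lambda p: sal[p[0]][p[1]])


if __name__ == "__main__":
    # A dark 9x9 image with one bright pixel in the middle.
    image = [[0.0] * 9 for _ in range(9)]
    image[4][4] = 1.0
    print(most_salient_location(saliency_map(image)))
```

The output has the same spatial layout as the input, and each value states only how salient a location is, not which feature made it so. A fuller bottom-up model would typically combine several feature channels and scales; this sketch deliberately leaves that out.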
Similar resources
A Comparative Study of Effect of Bottom-up and Top-down Instructional Approaches on EFL Learners’ Vocabulary Recall and Retention
This quasi-experimental study investigated the effect of bottom-up and top-down instructional approaches on English as a foreign language (EFL) vocabulary recall and retention. To this end, 44 high school students from two intact classes were assigned to bottom-up (n = 21) and top-down (n = 23) groups. The participants were exposed to 20 hours of explicit vocabulary instruction during 10 weeks ...
Event-Related Potentials of Bottom-Up and Top-Down Processing of Emotional Faces
Introduction: Emotional stimulus is processed automatically in a bottom-up way or can be processed voluntarily in a top-down way. Imaging studies have indicated that bottom-up and top-down processing are mediated through different neural systems. However, temporal differentiation of top-down versus bottom-up processing of facial emotional expressions has remained to be clarified. The present st...
The effect of bottom-up and top-down auditory program training on the development of children's auditory processing skills
Although there have been several previous investigations on the role of auditory training for the development of auditory processing skills, it still remains unknown whether children with auditory processing difficulties can get improved auditory skills after exposure to a multi-modal training experience comprising both visual and tactile stimuli. The present study, therefore, attempted to use ...
An integrative top-down and bottom-up qualitative model construction framework for exploration of biochemical systems
Computational modelling of biochemical systems based on top-down and bottom-up approaches has been well studied over the last decade. In this research, after illustrating how to generate atomic components by a set of given reactants and two user pre-defined component patterns, we propose an integrative top-down and bottom-up modelling approach for stepwise qualitative exploration of interaction...
Effect of Metacognitive Strategy Training and Perfectionism on Listening Comprehension Sub-Processes
The present study aimed to examine any possible relevance of perfectionism as a personal trait variable, in moderating the effectiveness of meta-cognitive instruction on bottom-up and top-down sub-processes of listening comprehension with a sample of EFL learners in Iranian context. To this end, 94 female EFL learners were selected from among 136 EFL learners at Andisheh Language Institute in M...
Journal title:
- CoRR
Volume abs/1510.07748
Pages -
Publication date 2012